************************************************************************************************************************
* Analysis Do-File for 
*
* "Electoral Integrity Analysis"
* Organization of American States, Irfan Nooruddin, November 2019
*
* This version accurate as of 26 August 2020
*
* Please communicate errors identified to in62@georgetown.edu. Thanks in advance for your help.
************************************************************************************************************************

/*
DESCRIPTION OF DATASET
This replication file replicates and redoes the analysis presented on pages 86-94 of the official OAS 
electoral analysis of the 2019 Bolivian Presidential Election, which can be found here:
https://www.oas.org/fpdb/press/Audit-Report-EN-vFINAL.pdf

MODELING NOTES 19 AUGUST 2020:
1) Idrobo et al. (2020) correctly state that I should use a local linear regression smoother 
(lpoly with degree 1) instead of a local-mean smoother (lpoly with degree 0). My initial 
analysis used a lowess running means smoother (lowess ..., mean...). Below, I adopt the 
suggestion from Idrobo et al. MY FINDINGS ARE UNCHANGED.

2) They also suggest that I incorrectly omitted ~4% of the polling stations. This is INCORRECT. 
I address this concern below.

ERRATUM NOTE 25 AUGUST 2020: 
On 24 August 2020, researchers with CEPR in Washington, DC, identified a mistake in how the 
cumulative vote count using the Computo timestamps was calculated. I thank them for 
their feedback. The dataset now includes a corrected cumulative vote count indicator and 
re-calculates all the relevant figures and table-cell entries (original on pages 92-93 of the OAS
report). MY CONCLUSIONS ARE UNCHANGED.
*/


*******************************************
*******************************************
*******************************************// BEGIN 
*******************************************
*******************************************

set more off

clear // Clear Stata

use "nooruddin.bolivia 2019 oas analysis replication data.dta" // Load replication data set


***********************************************************************************
***** Quick Housekeeping
***************************************************************************

desc num_mesa_trep num_mesa_computo
codebook num_mesa_trep num_mesa_computo
corr num_mesa_trep num_mesa_computo

// 34,555 total polling stations
// Polling station numbers are the same in the TREP and COMPUTO datasets, which I use to merge the two
// datasets. This allows me to use the COMPUTO vote tallies alongside the TREP time stamps. There are 
// 1,511 polling stations (mesas) that are recorded in the COMPUTO data set that are not part of the 
// TREP data set. 

tab _merge

/* 
tab _merge

                 _merge |      Freq.     Percent        Cum.
------------------------+-----------------------------------
         using only (2) |      1,511        4.37        4.37
            matched (3) |     33,044       95.63      100.00
------------------------+-----------------------------------
                  Total |     34,555      100.00
*/


tab1 masdiff ccdiff
*gen masdiff = mas_computo - mas_trep
*gen ccdiff = cc_computo - cc_trep

// Checks the difference in reported votes between the TREP and COMPUTO for the two major parties. There are some 
// changes but very few. For the CC, 99.44% of polling stations have no changes in vote count. For MAS, 99.46% of 
// polling stations have no changes.

summ masdifftotal ccdifftotal
*egen masdifftotal = total(masdiff)
*egen ccdifftotal = total(ccdiff)

// Adds up the total change in votes for each party from TREP to COMPUTO. 

// I make the assumption that the 1,511 polling stations that are only in COMPUTO (_merge==2) but not in TREP can 
// be treated as "late reporters" and append them to the very end of the cumulative count.
//
// Page 86 of OAS report makes this clear, stating: "1,511 polling stations were not included in the TREP 
// but do appear in the final computo results, which are the official vote tallies of the Bolivian system. 
// All the analysis conducted below include these additional polling stations. Since they were not included 
// in the TREP, they are treated as being late reporters. We stress that all the results below are based on 
// the computo vote tallies. The overall conclusions do not change depending on whether we use the TREP or 
// computo time stamps, though the shape of the trend lines do, since the [TREP and Computo] time stamps are 
// not perfectly correlated."

*******************************************************
// Timestamps                                         * 
*******************************************************

// For TREP Data Set
desc verificador_date
*sort verificador_date
*gen cum_ps_natl_share = sum(ps_natl_share)
codebook cum_ps_natl_share
//Cumulative vote count using TREP timestamps only (n=33,044; 1,511 polling stations missing since only included in Computo and lack a TREP timestamp)


// For Computo Data Set
// Note that this is CORRECTED as of August 24, 2020. I thank David Rosnick and colleagues at CEPR in Washington, DC,
// for identifying a mistake in the original calculation.
desc NEWComputoDate
*sort NEWComputoDate num_mesa_computo // Multiple polling stations share the same HH:MM time-stamp so adding a second sorting layer promotes replicability of findings
*gen NEWcum_ps_natl_share_computo = sum(ps_natl_share_computo)
codebook NEWcum_ps_natl_share_computo
// CORRECTED cumulative vote count using computo vote tallies and using Computo timestamps


****************************************************************************
**** Code to generate other relevant variables of interest              ****
**** Note that all calculations are made with only Computo vote tallies ****
****************************************************************************

// Calculate polling station level voter turnout rate and valid vote share
*gen ps_turnout_rate_computo = (emitidos_computo / inscritos_computo) * 100 // Votes cast as share of registered voters
*gen ps_valid_vote_share_computo = ((validos_computo + blancos_computo) / emitidos_computo) * 100 // Valid votes as share of votes cast (i.e., excludes null votes)

// Calculate polling station level party vote shares
*gen cc_share_computo = (cc_computo / emitidos_computo) * 100
*gen fpv_share_computo = (fpv_computo / emitidos_computo) * 100
*gen mts_share_computo = (mts_computo / emitidos_computo) * 100
*gen ucs_share_computo = (ucs_computo / emitidos_computo) * 100
*gen mas_share_computo = (mas_computo / emitidos_computo) * 100
*gen x21f_share_computo = (_21F_computo / emitidos_computo) * 100
*gen pdc_share_computo = (pdc_computo / emitidos_computo) * 100
*gen mnr_share_computo = (mnr_computo / emitidos_computo) * 100
*gen pan_bol_share_computo = (PAN_BOL_computo / emitidos_computo) * 100




***************************************************************************
*****REPLICATING FIGURE ON PAGE 87 of OAS REPORT
// While Mr Morales began to out-perform Mr Mesa early on, leading to his 7.29% margin with 84% of the
// vote counted, the graph above shows that the trends for both parties change after that point. This
// divergence grows even sharper after the 95% mark.
***************************************************************************


twoway (lowess cc_share_computo cum_ps_natl_share, mean bwidth(0.2) lcolor(green) lwidth(vthick) lpattern(dash) /*
*/ legend(label(1 "Civic Community")) graphr(color(white)) lstyle(none)) /*
*/ (lowess mas_share_computo cum_ps_natl_share, mean bwidth(0.2) lcolor(red) lwidth(vthick) /*
*/ legend(label(2 "MAS"))), yline(50, lcolor(black)) xscale(extend nofextend) xline(0.84 0.95, lcolor(black)) /*
*/ xlabel(0 0.84 0.95 1) title("Bolivia Presidential Election 2019") /*
*/ ytitle("Polling Station Level Vote Share by Party") /*
*/ xtitle("Cumulative National Vote Share Counted")


***************************************************************************
*****REPLICATING FIGURE AT TOP OF PAGE 88 of OAS REPORT
//This figure plots the polling-station-level CC vote share using the Computo vote tallies but
//with the TREP time stamps only. The 1,511 polling stations that were not included in the TREP
//data are therefore excluded.
***************************************************************************

// Using a lowess "running means" smoother as reported in the OAS report
twoway (scatter cc_share_computo cum_ps_natl_share if cum_ps_natl_share<=1, sort mcolor(gray%60) msymbol(point)) /*
*/ (lowess cc_share_computo cum_ps_natl_share if cum_ps_natl_share<0.95, mean bwidth(0.3) lcolor(green) lwidth(vthick)) /*
*/ (lowess cc_share_computo cum_ps_natl_share if cum_ps_natl_share>0.95&cum_ps_natl_share<=1, mean bwidth(0.6) lcolor(red) lwidth(vthick)), /*
*/ yline(50, lcolor(black)) xscale(extend nofextend) xline(0.95, lcolor(black)) xlabel(0 0.95 1) leg(off) graphregion(style(none) color(none)) /*
*/ title("Bolivia Presidential Election 2019") xtitle("Cumulative National Vote Share Counted in TREP Data Set (n=33,044)") ytitle("Polling Station Level CC Vote Share")

// Using a local linear regression smoother (lpoly...., deg(1)....)  per Idrobo et al. (2020)
// Kink still apparent
twoway (scatter cc_share_computo cum_ps_natl_share if cum_ps_natl_share<=1, sort mcolor(gray%60) msymbol(point)) /*
*/ (lpoly cc_share_computo cum_ps_natl_share if cum_ps_natl_share<0.95, deg(1) bwidth(0.3) lcolor(green) lwidth(vthick)) /*
*/ (lpoly cc_share_computo cum_ps_natl_share if cum_ps_natl_share>0.95&cum_ps_natl_share<=1, deg(1) bwidth(0.6) lcolor(red) lwidth(vthick)), /*
*/ yline(50, lcolor(black)) xscale(extend nofextend) xline(0.95, lcolor(black)) xlabel(0 0.95 1) leg(off) graphregion(style(none) color(none)) /*
*/ title("Bolivia Presidential Election 2019") xtitle("Cumulative National Vote Share Counted in TREP Data Set (n=33,044)") ytitle("Polling Station Level CC Vote Share")


***************************************************************************
*****REPLICATING FIGURE AT BOTTOM OF PAGE 88 of OAS REPORT
//This figure plots the polling-station-level MAS vote share using the Computo vote tallies but
//with the TREP time stamps only. The 1,511 polling stations that were not included in the TREP
//data are therefore excluded.
***************************************************************************

// Using a lowess "running means" smoother as reported in the OAS report
twoway (scatter mas_share_computo cum_ps_natl_share if cum_ps_natl_share<=1, sort mcolor(gray%60) msymbol(point)) /*
*/ (lowess mas_share_computo cum_ps_natl_share if cum_ps_natl_share<0.95, mean bwidth(0.3) lcolor(green) lwidth(vthick)) /*
*/ (lowess mas_share_computo cum_ps_natl_share if cum_ps_natl_share>0.95&cum_ps_natl_share<=1, mean bwidth(0.6) lcolor(red) lwidth(vthick)), /*
*/ yline(50, lcolor(black)) xscale(extend nofextend) xline(0.95, lcolor(black)) xlabel(0 0.95 1) leg(off) graphregion(style(none) color(none)) /*
*/ title("Bolivia Presidential Election 2019") xtitle("Cumulative National Vote Share Counted in TREP Data Set (n=33,044)") ytitle("Polling Station Level MAS Vote Share")

// Using a local linear regression smoother (lpoly...., deg(1)....)  per Idrobo et al. (2020)
// Kink still apparent
twoway (scatter mas_share_computo cum_ps_natl_share if cum_ps_natl_share<=1, sort mcolor(gray%60) msymbol(point)) /*
*/ (lpoly mas_share_computo cum_ps_natl_share if cum_ps_natl_share<0.95, deg(1) bwidth(0.3) lcolor(green) lwidth(vthick)) /*
*/ (lpoly mas_share_computo cum_ps_natl_share if cum_ps_natl_share>0.95&cum_ps_natl_share<=1, deg(1) bwidth(0.6) lcolor(red) lwidth(vthick)), /*
*/ yline(50, lcolor(black)) xscale(extend nofextend) xline(0.95, lcolor(black)) xlabel(0 0.95 1) leg(off) graphregion(style(none) color(none)) /*
*/ title("Bolivia Presidential Election 2019") xtitle("Cumulative National Vote Share Counted in TREP Data Set (n=33,044)") ytitle("PS-Level MAS Vote Share") 


***************************************************************************
*****ADDRESSING THE MISSING 1,511 POLLING STATIONS IN THE FIGURES ABOVE
/* 1,511 polling stations were only included in the Computo data set but were not
included in the TREP data. Therefore they do not have TREP timestamps, and so are
excluded from the figures above. Idrobo et al. (2020) argue that this is a 
mistake and that including them changes the result. I show below that this is not 
the case.

The question is how to include these polling stations given that they do not have 
a timestamp for the TREP dataset. An arbitrary way is to treat them all as 
"late-reporting" and to add them to the end of the TREP count. 

Doing so adds ~4% of observations to the end of the TREP data set. NOTE THAT THE
BREAK POINT IDENTIFIED ABOVE IS AT 95% OF THE ORIGINAL TREP DATA SET. Once we
add these additional 4% (coded as 1.04 below), the break line stays at 0.95 on the
original TREP scale.
*/
***************************************************************************

summ cum_ps_natl_share
recode cum_ps_natl_share .=1.04
summ cum_ps_natl_share

// Using a local linear regression smoother (lpoly...., deg(1)....)  per Idrobo et al. (2020)
// The "break" at the original 95% point identified above is apparent, but the post-break smoother is trying to
// account for the 1,511 observations all sitting at the 1.04 mark and so the line is downward sloping. But 
// the break identified above is still clearly there.

twoway (scatter mas_share_computo cum_ps_natl_share if cum_ps_natl_share<=1.05, sort mcolor(gray%60) msymbol(point)) /*
*/ (lpoly mas_share_computo cum_ps_natl_share if cum_ps_natl_share<0.95, deg(1) bwidth(0.3) lcolor(green) lwidth(vthick)) /*
*/ (lpoly mas_share_computo cum_ps_natl_share if cum_ps_natl_share>0.95&cum_ps_natl_share<=1.11, deg(1) bwidth(0.6) lcolor(red) lwidth(vthick)), /*
*/ yline(50, lcolor(black)) xscale(extend nofextend) xline(0.95, lcolor(black)) xlabel(0 0.95 1 1.04 "C") leg(off) graphregion(style(none) color(none)) /*
*/ title("Bolivia Presidential Election 2019") xtitle("Cumulative National Vote Share Counted (TREP + Computo-only mesas)") ytitle("PS-Level MAS Vote Share") 

recode cum_ps_natl_share 1.04=. // Returning the variable to original state
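
// A minimal sketch, not part of the original analysis: the late-reporter coding can also be
// set up on a copy of the variable, so that the original never has to be recoded and then
// restored. The name chk_cum_share is hypothetical and introduced here only for illustration.
gen double chk_cum_share = cum_ps_natl_share
replace chk_cum_share = 1.04 if missing(cum_ps_natl_share) // computo-only mesas placed after the end of the TREP count
count if chk_cum_share == 1.04 // expected to match the 1,511 computo-only polling stations
drop chk_cum_share // cum_ps_natl_share itself is left untouched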

***************************************************************************
******REPLICATING THE FACTS REPORTED ON PAGE 89
***************************************************************************

* CLAIM: "With 95% of the vote counted per the TREP time stamps, the margin was still less than 10%"
total(emitidos_computo) if cum_ps_natl_share<=0.95 // emitidos_computo = validos_computo + blancos_computo
total(validos_computo) if cum_ps_natl_share<=0.95 // valid votes only
total(mas_computo) if cum_ps_natl_share<=0.95
total(cc_computo) if cum_ps_natl_share<=0.95
display 2585145/5599995 // MAS Vote Share with 95% of TREP time stamps = 46.16% -- NOTE text has an error and says 43.16%
display 2095215/5599995 // CC Vote Share with 95% of TREP time stamps = 37.41% -- NOTE text has an error and says 34.98%
display 2585145-2095215 // MAS had an advantage of 489,930 votes -- NOTE text has an error and says 489,963 (33 votes difference)
display (2585145/5599995) - (2095215/5599995) // MAS had an advantage of 8.7 points over CC at this point, which is less than 10%****
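
// A minimal sketch, not part of the original analysis: the shares above can also be computed
// from stored results rather than hand-typed totals, which guards against transcription slips.
// Assumes the denominator is validos_computo (the hard-coded 5,599,995 is consistent with the
// valid-vote totals reported further below: 5,599,995 + 290,624 + 247,159 = 6,137,778).
// The chk_* scalar names are hypothetical.
quietly summarize validos_computo if cum_ps_natl_share<=0.95
scalar chk_valid95 = r(sum)
quietly summarize mas_computo if cum_ps_natl_share<=0.95
scalar chk_mas95 = r(sum)
quietly summarize cc_computo if cum_ps_natl_share<=0.95
scalar chk_cc95 = r(sum)
display "MAS share (%): " %6.2f 100*chk_mas95/chk_valid95
display "CC share (%): " %6.2f 100*chk_cc95/chk_valid95
display "MAS lead (points): " %6.2f 100*(chk_mas95 - chk_cc95)/chk_valid95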


* CLAIM: "With 95% of the votes counted in the TREP, Morales gained a lead of 488,891 votes (8.7%)"
display (2585145/5599995) - (2095215/5599995) // MAS had an advantage of 8.7 points over CC****


* CLAIM: "In the final 5% of the TREP alone, Morales added another 106,799 votes to his lead -- out of 290,624 total votes cast 
* which pushed his overall margin to 10.11%"
total(validos_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1 // 290,624 votes cast
total(mas_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1 // MAS received 176,189 of these votes
total(cc_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1 // CC received 69,390 of these votes
display 176189-69390 // Morales added 106,799 votes to his lead


* CLAIM: "If we consider only the polling stations not included in TREP, then Morales obtained 128,025 votes out of 247,025, while 
* Mesa obtained 76,315 (51,710 votes less than Morales)."
total(validos_computo) if cum_ps_natl_share>1 // 247,159 votes cast -- NOTE text has an error and says 247,025
total(mas_computo) if cum_ps_natl_share>1 // MAS received 128,025 of these votes
total(cc_computo) if cum_ps_natl_share>1 // CC received 76,315 of these votes
display 128025-76315 // Morales added 51,710 votes to his lead


* CLAIM: "This  means that of the overall margin of victory of just under 650,000 votes, over 156,000 came in just the final 5% of 
* the vote count"
total(validos_computo) // total of 6,137,778 valid votes cast
total(mas_computo) // MAS received 2,889,359 total votes
total(cc_computo) // CC received 2,240,920 total votes
display 2889359 - 2240920 // Overall margin of victory was 648,439 votes
display 106799 + 51710 // 158,509 votes were added to the margin in the final 5% of TREP plus 1511 stations not included in TREP


* CLAIM: "The TREP data reports results from 33,044 polling stations altogether. Of these, 31,379 reported their results before the 95% cumulative vote 
* count threshold; 1,665 polling stations reported after."

codebook num_mesa_trep if cum_ps_natl_share~=. // 33044 polling stations in TREP data
codebook num_mesa_trep if cum_ps_natl_share<=0.95 // 31379 polling stations reported before 95% cumulative vote count threshold in TREP
codebook num_mesa_trep if cum_ps_natl_share>0.95&cum_ps_natl_share~=. // 1665 polling stations reported after 95% cumulative vote count threshold in TREP
codebook num_mesa_trep if cum_ps_natl_share==. // 1511 polling stations not included in TREP, but included in COMPUTO


* CLAIM: "Of the late-reporting polling stations, the bulk were in one of seven departments in Bolivia. These are Beni
* (92), Chuquisaca (74), Cochabamba (541), La Paz (294), Potosi (215), Santa Cruz (184), and Tarija (115),
* which together account for 1,515 or 94% of the late-reporting polling stations with the numbers in
* parentheses indicating the number of polling-stations in each department that were part of the last 5%
* of the vote count."

total(validos_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Beni" // 92
total(validos_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Chuquisaca" // 74
total(validos_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Cochabamba" // 539 -- NOTE text has an error and says 541
total(validos_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="La Paz" //  294
total(validos_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="PotosÃ­" // 215
total(validos_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Santa Cruz" // 184
total(validos_computo) if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Tarija" // 115
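
// A minimal sketch, not part of the original analysis: the polling-station counts noted above
// come from the "Number of obs" line of each -total- call. The same counts can be read off a
// single frequency table. Assumes dep is the string department variable used above.
tabulate dep if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1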


***************************************************************************
*****REPLICATING TABLE AT TOP OF PAGE 90
/* While the last figure shows that adding the 1,511 computo-only polling stations
into the TREP data set does not alter my finding of a break, the Tables on page 90 
make this even clearer.
*/
***************************************************************************


******Before 95% cumulative votes counted in TREP**************

****** Column 1: "Votes Cast" ******

total(validos_computo) if cum_ps_natl_share<=0.95
total(validos_computo) if cum_ps_natl_share<=0.95&dep=="Beni"
total(validos_computo) if cum_ps_natl_share<=0.95&dep=="Chuquisaca"
total(validos_computo) if cum_ps_natl_share<=0.95&dep=="Cochabamba"
total(validos_computo) if cum_ps_natl_share<=0.95&dep=="La Paz"
total(validos_computo) if cum_ps_natl_share<=0.95&dep=="PotosÃ­"
total(validos_computo) if cum_ps_natl_share<=0.95&dep=="Santa Cruz"
total(validos_computo) if cum_ps_natl_share<=0.95&dep=="Tarija"

****** Column 2: "MAS Votes" ******

total(mas_computo) if cum_ps_natl_share<=0.95
total(mas_computo) if cum_ps_natl_share<=0.95&dep=="Beni"
total(mas_computo) if cum_ps_natl_share<=0.95&dep=="Chuquisaca"
total(mas_computo) if cum_ps_natl_share<=0.95&dep=="Cochabamba"
total(mas_computo) if cum_ps_natl_share<=0.95&dep=="La Paz"
total(mas_computo) if cum_ps_natl_share<=0.95&dep=="PotosÃ­"
total(mas_computo) if cum_ps_natl_share<=0.95&dep=="Santa Cruz"
total(mas_computo) if cum_ps_natl_share<=0.95&dep=="Tarija"


****** Column 3: "CC Votes" ******

total(cc_computo) if cum_ps_natl_share<=0.95
total(cc_computo) if cum_ps_natl_share<=0.95&dep=="Beni"
total(cc_computo) if cum_ps_natl_share<=0.95&dep=="Chuquisaca"
total(cc_computo) if cum_ps_natl_share<=0.95&dep=="Cochabamba"
total(cc_computo) if cum_ps_natl_share<=0.95&dep=="La Paz"
total(cc_computo) if cum_ps_natl_share<=0.95&dep=="PotosÃ­"
total(cc_computo) if cum_ps_natl_share<=0.95&dep=="Santa Cruz"
total(cc_computo) if cum_ps_natl_share<=0.95&dep=="Tarija"


****** Columns 4 "MAS %" and 5 "CC %" are calculated manually using the calculations above.
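
// A minimal sketch, not part of the original analysis: the percentage columns can be computed
// rather than worked out by hand. Assumes the percentages are MAS and CC votes divided by the
// "Votes Cast" column (validos_computo); mas_pct and cc_pct are hypothetical names.
// preserve/restore leaves the data in memory unchanged. The listing covers every value of dep,
// not only the seven departments shown in the table.
preserve
keep if cum_ps_natl_share<=0.95
collapse (sum) validos_computo mas_computo cc_computo, by(dep)
gen mas_pct = 100 * mas_computo / validos_computo
gen cc_pct = 100 * cc_computo / validos_computo
list dep validos_computo mas_computo cc_computo mas_pct cc_pct, noobs clean
restore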


*********Last 5% of TREP and Mesas in Computo but not in TREP***********

****** Column 6: "Votes Cast" ******

total(validos_computo) if cum_ps_natl_share>0.95
total(validos_computo) if cum_ps_natl_share>0.95&dep=="Beni"
total(validos_computo) if cum_ps_natl_share>0.95&dep=="Chuquisaca"
total(validos_computo) if cum_ps_natl_share>0.95&dep=="Cochabamba"
total(validos_computo) if cum_ps_natl_share>0.95&dep=="La Paz"
total(validos_computo) if cum_ps_natl_share>0.95&dep=="PotosÃ­" // 33,576 polling stations in last 5% of TREP with name "PotosÃ­"
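total(validos_computo) if cum_ps_natl_share>0.95&dep=="Potosí" // 25,214 valid votes in computo-only polling stations (not in TREP), stored with name "Potosí"
// NOTE: both Potosí figures in this column (33,576 and 25,214) are vote totals, not counts of polling stations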
total(validos_computo) if cum_ps_natl_share>0.95&dep=="Santa Cruz"
total(validos_computo) if cum_ps_natl_share>0.95&dep=="Tarija"

****** Column 7: "MAS Votes" ******

total(mas_computo) if cum_ps_natl_share>0.95
total(mas_computo) if cum_ps_natl_share>0.95&dep=="Beni"
total(mas_computo) if cum_ps_natl_share>0.95&dep=="Chuquisaca"
total(mas_computo) if cum_ps_natl_share>0.95&dep=="Cochabamba"
total(mas_computo) if cum_ps_natl_share>0.95&dep=="La Paz"
total(mas_computo) if cum_ps_natl_share>0.95&dep=="PotosÃ­" // 19,534 MAS votes in last 5% of TREP with name "PotosÃ­"
total(mas_computo) if cum_ps_natl_share>1&dep=="Potosí" // 13,521 MAS votes in computo only but not in TREP with name "Potosí"
total(mas_computo) if cum_ps_natl_share>0.95&dep=="Santa Cruz"
total(mas_computo) if cum_ps_natl_share>0.95&dep=="Tarija"


****** Column 8: "CC Votes" ******

total(cc_computo) if cum_ps_natl_share>0.95
total(cc_computo) if cum_ps_natl_share>0.95&dep=="Beni"
total(cc_computo) if cum_ps_natl_share>0.95&dep=="Chuquisaca"
total(cc_computo) if cum_ps_natl_share>0.95&dep=="Cochabamba"
total(cc_computo) if cum_ps_natl_share>0.95&dep=="La Paz"
total(cc_computo) if cum_ps_natl_share>0.95&dep=="PotosÃ­" // 8,416 CC votes in last 5% of TREP with name "PotosÃ­"
total(cc_computo) if cum_ps_natl_share>1&dep=="Potosí" // 7,146 CC votes in computo only but not in TREP with name "Potosí"
total(cc_computo) if cum_ps_natl_share>0.95&dep=="Santa Cruz"
total(cc_computo) if cum_ps_natl_share>0.95&dep=="Tarija"


****** Columns 9 "MAS %" and 10 "CC %" are calculated manually using the calculations above.



***************************************************************************
*****REPLICATING TABLE AT BOTTOM OF PAGE 90
/* While the last figure shows that adding the 1,511 computo-only polling stations
into the TREP data set does not alter my finding of a break, the Tables on page 90 
make this even clearer.
*/
***************************************************************************

** Column 1: PS Level MAS vote share between 0-95%**

summ mas_share_computo if cum_ps_natl_share<=0.95
summ mas_share_computo if cum_ps_natl_share<=0.95&dep=="Beni"
summ mas_share_computo if cum_ps_natl_share<=0.95&dep=="Chuquisaca"
summ mas_share_computo if cum_ps_natl_share<=0.95&dep=="Cochabamba"
summ mas_share_computo if cum_ps_natl_share<=0.95&dep=="La Paz"
summ mas_share_computo if cum_ps_natl_share<=0.95&dep=="PotosÃ­" 
summ mas_share_computo if cum_ps_natl_share<=0.95&dep=="Santa Cruz" 
summ mas_share_computo if cum_ps_natl_share<=0.95&dep=="Tarija" 

** Column 2: PS Level MAS vote share between 95%-100%**

summ mas_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1 
summ mas_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Beni"
summ mas_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Chuquisaca"
summ mas_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Cochabamba"
summ mas_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="La Paz"
summ mas_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="PotosÃ­"
summ mas_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Santa Cruz"
summ mas_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Tarija"


** Column 3: 'Computo Only' column for MAS vote share**
summ mas_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=. // NOTE: Value is 49.548 so entry could have been 49.5 instead of 49.6
summ mas_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Beni"
summ mas_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Chuquisaca"
summ mas_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Cochabamba"
summ mas_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="La Paz"
summ mas_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Potosí"
summ mas_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Santa Cruz" // NOTE: Value is 37.049 so entry could have been 37.0 instead of 37.1
summ mas_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Tarija"


** Column 4: CC vote share between 0-95%**
summ cc_share_computo if cum_ps_natl_share<=0.95
summ cc_share_computo if cum_ps_natl_share<=0.95&dep=="Beni"
summ cc_share_computo if cum_ps_natl_share<=0.95&dep=="Chuquisaca"
summ cc_share_computo if cum_ps_natl_share<=0.95&dep=="Cochabamba"
summ cc_share_computo if cum_ps_natl_share<=0.95&dep=="La Paz"
summ cc_share_computo if cum_ps_natl_share<=0.95&dep=="PotosÃ­"
summ cc_share_computo if cum_ps_natl_share<=0.95&dep=="Santa Cruz"
summ cc_share_computo if cum_ps_natl_share<=0.95&dep=="Tarija"


** Column 5: CC vote share between 95%-100%**
summ cc_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1
summ cc_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Beni"
summ cc_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Chuquisaca"
summ cc_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Cochabamba"
summ cc_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="La Paz"
summ cc_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="PotosÃ­"
summ cc_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Santa Cruz"
summ cc_share_computo if cum_ps_natl_share>=0.95&cum_ps_natl_share<=1&dep=="Tarija" 


** Column 6: 'Computo Only' column for CC vote share**
summ cc_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.
summ cc_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Beni"
summ cc_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Chuquisaca"
summ cc_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Cochabamba"
summ cc_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="La Paz"
summ cc_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Potosí"
summ cc_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Santa Cruz"
summ cc_share_computo if cum_ps_natl_share==.&NEWcum_ps_natl_share_computo~=.&dep=="Tarija"


recode cum_ps_natl_share 1.04 = .
sum cum_ps_natl_share // 33,044 polling stations that were in the TREP data set only
// RETURNS THIS VARIABLE TO INITIAL SETTING -- Recodes the polling stations that are only in 
// the computo data set as missing in the TREP data set


***************************************************************************
/* COMPUTO TIME STAMPS ANALYSIS
Note again that there was an error in the way the original cumulative vote count variable 
based on the computo timestamps was calculated. Everything below uses a corrected 
variable based on 'NEWComputoDate'

--> ALL THE RESULTS OF MY ORIGINAL ANALYSIS HOLD
*/
*****************************************************************************


***************************************************************************
****** RE-DRAWING FIGURE IN MIDDLE OF PAGE 91
***************************************************************************

// Using a running weighted lines smoother (lowess)
twoway (lowess cc_share_computo NEWcum_ps_natl_share_computo, bwidth(0.2) lcolor(green) lwidth(vthick) lpattern(dash) /*
*/ legend(label(1 "Civic Community")) graphr(color(white)) lstyle(none)) /*
*/ (lowess mas_share_computo NEWcum_ps_natl_share_computo, bwidth(0.2) lcolor(red) lwidth(vthick) legend(label(2 "MAS"))), /*
*/ yline(50, lcolor(black)) xscale(extend nofextend) xlabel(0 1) /*
*/ title("Bolivia Presidential Election 2019") ytitle("Polling Station Level Vote Share by Party") xtitle("CORRECTED Cumulative National Vote Share Counted in Computo")


***************************************************************************
****** RE-DRAWING THE FIGURES ON PAGE 92
***************************************************************************

/* The figures below are based on the Computo Time Stamps which exist for all 34,555 polling stations. 
The analysis above makes clear that the break identified was with 5% of the count remaining in the TREP-only data set. We 
have now added another 4% of cases that were only in the Computo. Therefore, the relevant break-point is with 
about 9% (5+4) of the count remaining, or at 0.91 of the cumulative vote count in the computo data set.
*/
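
// A worked check, not part of the original analysis: the ~9% figure can be approximated from
// totals reported in the page 89 section above. This uses valid votes and polling-station
// counts as rough proxies for the cumulative count, which is an assumption.
display (290624 + 247159)/6137778 // last 5% of TREP plus computo-only votes: roughly 0.088, i.e., a break near 0.91
display (1665 + 1511)/34555 // same groups as a share of polling stations: roughly 0.092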

****** REPLICATING FIGURE AT TOP OF PAGE 92

// Using a local linear regression smoother (lpoly...., deg(1)....) per Idrobo et al. (2020)
// Kink still apparent
twoway (scatter mas_share_computo NEWcum_ps_natl_share_computo if NEWcum_ps_natl_share_computo<=1, sort mcolor(gray%60) msymbol(point)) /*
*/ (lpoly mas_share_computo NEWcum_ps_natl_share_computo if NEWcum_ps_natl_share_computo<0.91, deg(1) bwidth(0.3) lcolor(green) lwidth(vthick)) /*
*/ (lpoly mas_share_computo NEWcum_ps_natl_share_computo if NEWcum_ps_natl_share_computo>0.91&NEWcum_ps_natl_share_computo<=1, deg(1) bwidth(0.4) /*
*/ lcolor(red) lwidth(vthick)), yline(50, lcolor(black)) xscale(extend nofextend) xline(0.91, lcolor(black)) xlabel(0 0.91 1) leg(off) /*
*/ graphregion(style(none) color(none)) /*
*/ title("Bolivia Presidential Election 2019") /*
*/ xtitle("CORRECTED Cumulative National Vote Share Counted in Computo") /*
*/ ytitle("Polling Station Level MAS Vote Share")

twoway (scatter mas_share_computo NEWcum_ps_natl_share_computo if NEWcum_ps_natl_share_computo<=1, sort mcolor(gray%60) msymbol(point)) /*
*/ (lpoly mas_share_computo NEWcum_ps_natl_share_computo if NEWcum_ps_natl_share_computo<0.95, deg(1) bwidth(0.3) lcolor(green) lwidth(thick)) /*
*/ (lpoly mas_share_computo NEWcum_ps_natl_share_computo if NEWcum_ps_natl_share_computo>0.95&NEWcum_ps_natl_share_computo<=1, deg(1) bwidth(0.4) /*
*/ lcolor(red) lwidth(thick)), yline(50, lcolor(black)) xscale(extend nofextend) xline(0.95, lcolor(black)) xlabel(0 0.95 1) leg(off) /*
*/ graphregion(style(none) color(none)) /*
*/ title("Bolivia Presidential Election 2019") /*
*/ xtitle("CORRECTED Cumulative National Vote Share Counted in Computo") /*
*/ ytitle("Polling Station Level MAS Vote Share")


****** REPLICATING FIGURE AT BOTTOM OF PAGE 92

// Using a local linear regression smoother (lpoly...., deg(1)....) per Idrobo et al. (2020)
// Kink still apparent
twoway (scatter cc_share_computo NEWcum_ps_natl_share_computo if pais=="Bolivia"&NEWcum_ps_natl_share_computo<=1, sort mcolor(gray%60) msymbol(point)) /*
*/ (lpoly cc_share_computo NEWcum_ps_natl_share_computo if pais=="Bolivia"&NEWcum_ps_natl_share_computo<0.91, deg(1) bwidth(0.3) lcolor(green) lwidth(vthick)) /*
*/ (lpoly cc_share_computo NEWcum_ps_natl_share_computo if pais=="Bolivia"&NEWcum_ps_natl_share_computo>0.91&NEWcum_ps_natl_share_computo<=1, deg(1) bwidth(0.6) /*
*/ lcolor(red) lwidth(vthick)), yline(50, lcolor(black)) xscale(extend nofextend) xline(0.91, lcolor(black)) xlabel(0 0.91 1) leg(off) /*
*/ graphregion(style(none) color(none)) /*
*/ title("Bolivia Presidential Election 2019") /*
*/ xtitle("CORRECTED Cumulative National Vote Share Counted in Computo") /*
*/ ytitle("PS-Level CC Vote Share")



***************************************************************************
*****RE-DOING TABLE AT TOP OF PAGE 93
***************************************************************************

***** PS Level MAS & CC Vote Share for 0-91% & 91-100%******
***** Using "Computo Counts" & "Computo Timestamps"

**** MAS: 0-91%*****
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.91
scalar a = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Beni"
scalar a1 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Chuquisaca"
scalar a2 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Cochabamba"
scalar a3 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="La Paz"
scalar a4 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Potosí"
scalar a5 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Santa Cruz"
scalar a6 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Tarija"
scalar a7 = r(mean)

**** MAS: 91-100%*****

summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1
scalar b = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Beni"
scalar b1 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Chuquisaca"
scalar b2 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Cochabamba"
scalar b3 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="La Paz"
scalar b4 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Potosí"
scalar b5 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Santa Cruz"
scalar b6 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Tarija"
scalar b7 = r(mean)

**** CC: 0-91%*****
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.91
scalar c = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Beni"
scalar c1 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Chuquisaca"
scalar c2 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Cochabamba"
scalar c3 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="La Paz"
scalar c4 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Potosí"
scalar c5 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Santa Cruz"
scalar c6 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.91&dep=="Tarija"
scalar c7 = r(mean)

**** CC: 91-100%*****

summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1
scalar m = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Beni"
scalar m1 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Chuquisaca"
scalar m2 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Cochabamba"
scalar m3 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="La Paz"
scalar m4 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Potosí"
scalar m5 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Santa Cruz"
scalar m6 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.91&NEWcum_ps_natl_share_computo<=1&dep=="Tarija"
scalar m7 = r(mean)


******Estimating MAS Advantage Over CC****

*** MAS Advantage Over CC: 0-91% Cumulative Count ***

**National**
scalar e = a - c
display e
**Beni**
scalar e1 = a1 - c1
display e1
**Chuquisaca**
scalar e2 = a2 - c2
display e2
**Cochabamba**
scalar e3 = a3 - c3
display e3
**La Paz**
scalar e4 = a4 - c4
display e4
**Potosí**
scalar e5 = a5 - c5
display e5
** Santa Cruz**
scalar e6 = a6 - c6
display e6
** Tarija**
scalar e7 = a7 - c7
display e7

*** MAS Advantage Over CC: 91-100% Cumulative Count ***

**National**
scalar f = b - m
display f
**Beni**
scalar f1 = b1 - m1
display f1
**Chuquisaca**
scalar f2 = b2 - m2
display f2
**Cochabamba**
scalar f3 = b3 - m3
display f3
**La Paz**
scalar f4 = b4 - m4
display f4
**Potosí**
scalar f5 = b5 - m5
display f5
** Santa Cruz**
scalar f6 = b6 - m6
display f6
** Tarija**
scalar f7 = b7 - m7
display f7


***************************************************************************
*****RE-DOING TABLE AT TOP OF PAGE 93 USING A 95% THRESHOLD
***************************************************************************

***** PS Level MAS & CC Vote Share for 0-95% & 95-100%******
***** Using "Computo Counts" & "Computo Timestamps"
// This table mirrors the earlier table based on the TREP time stamps, here using the Computo counts and time stamps.

**** MAS: 0-95%*****
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.95
scalar a = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Beni"
scalar a1 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Chuquisaca"
scalar a2 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Cochabamba"
scalar a3 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="La Paz"
scalar a4 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Potosí"
scalar a5 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Santa Cruz"
scalar a6 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Tarija"
scalar a7 = r(mean)

**** MAS: 95-100%*****

summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1
scalar b = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Beni"
scalar b1 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Chuquisaca"
scalar b2 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Cochabamba"
scalar b3 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="La Paz"
scalar b4 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Potosí"
scalar b5 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Santa Cruz"
scalar b6 = r(mean)
summ mas_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Tarija"
scalar b7 = r(mean)

**** CC: 0-95%*****
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.95
scalar c = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Beni"
scalar c1 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Chuquisaca"
scalar c2 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Cochabamba"
scalar c3 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="La Paz"
scalar c4 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Potosí"
scalar c5 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Santa Cruz"
scalar c6 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo<=0.95&dep=="Tarija"
scalar c7 = r(mean)

**** CC: 95-100%*****

summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1
scalar m = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Beni"
scalar m1 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Chuquisaca"
scalar m2 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Cochabamba"
scalar m3 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="La Paz"
scalar m4 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Potosí"
scalar m5 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Santa Cruz"
scalar m6 = r(mean)
summ cc_share_computo if NEWcum_ps_natl_share_computo>=0.95&NEWcum_ps_natl_share_computo<=1&dep=="Tarija"
scalar m7 = r(mean)


******Estimating MAS Advantage Over CC****

*** MAS Advantage Over CC: 0-95% Cumulative Count ***

**National**
scalar e = a - c
display e
**Beni**
scalar e1 = a1 - c1
display e1
**Chuquisaca**
scalar e2 = a2 - c2
display e2
**Cochabamba**
scalar e3 = a3 - c3
display e3
**La Paz**
scalar e4 = a4 - c4
display e4
**Potosí**
scalar e5 = a5 - c5
display e5
** Santa Cruz**
scalar e6 = a6 - c6
display e6
** Tarija**
scalar e7 = a7 - c7
display e7

*** MAS Advantage Over CC: 95-100% Cumulative Count ***

**National**
scalar f = b - m
display f
**Beni**
scalar f1 = b1 - m1
display f1
**Chuquisaca**
scalar f2 = b2 - m2
display f2
**Cochabamba**
scalar f3 = b3 - m3
display f3
**La Paz**
scalar f4 = b4 - m4
display f4
**Potosí**
scalar f5 = b5 - m5
display f5
** Santa Cruz**
scalar f6 = b6 - m6
display f6
** Tarija**
scalar f7 = b7 - m7
display f7
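
// A compact alternative (a sketch, not part of the original do-file): the same means and
// MAS-minus-CC gaps computed in a loop, for both the 0.91 and the 0.95 thresholds. It uses the
// same variables and the same inequality conditions as the blocks above.
// Assumptions: department names are read from dep with levelsof, so the loop covers every value
// of dep in the data rather than only the seven listed above; the temporary scalar names
// (mpre, cpre, mpost, cpost) are introduced here purely for illustration.
tempname mpre cpre mpost cpost
quietly levelsof dep, local(deps)
foreach cut in 0.91 0.95 {
    display _newline "MAS advantage over CC, threshold `cut'"

    // National figures: before the threshold, then from the threshold to 100%
    quietly summ mas_share_computo if NEWcum_ps_natl_share_computo<=`cut'
    scalar `mpre' = r(mean)
    quietly summ cc_share_computo if NEWcum_ps_natl_share_computo<=`cut'
    scalar `cpre' = r(mean)
    quietly summ mas_share_computo if NEWcum_ps_natl_share_computo>=`cut' & NEWcum_ps_natl_share_computo<=1
    scalar `mpost' = r(mean)
    quietly summ cc_share_computo if NEWcum_ps_natl_share_computo>=`cut' & NEWcum_ps_natl_share_computo<=1
    scalar `cpost' = r(mean)
    display %-15s "National" "  before: " %7.2f (`mpre' - `cpre') "  after: " %7.2f (`mpost' - `cpost')

    // Department-level figures, one row per value of dep
    foreach d of local deps {
        quietly summ mas_share_computo if NEWcum_ps_natl_share_computo<=`cut' & dep==`"`d'"'
        scalar `mpre' = r(mean)
        quietly summ cc_share_computo if NEWcum_ps_natl_share_computo<=`cut' & dep==`"`d'"'
        scalar `cpre' = r(mean)
        quietly summ mas_share_computo if NEWcum_ps_natl_share_computo>=`cut' & NEWcum_ps_natl_share_computo<=1 & dep==`"`d'"'
        scalar `mpost' = r(mean)
        quietly summ cc_share_computo if NEWcum_ps_natl_share_computo>=`cut' & NEWcum_ps_natl_share_computo<=1 & dep==`"`d'"'
        scalar `cpost' = r(mean)
        display %-15s `"`d'"' "  before: " %7.2f (`mpre' - `cpre') "  after: " %7.2f (`mpost' - `cpost')
    }
}
// A department with no polling stations in a given range shows "." (missing) for that gap.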




*******************************************
*******************************************
*******************************************// END
*******************************************
*******************************************
